Abstract

In a matroid secretary problem, one is presented with a sequence 
of objects of various weights in a random order, and must choose ir- 
revocably to accept or reject each item. There is a further constraint 
that the set of items selected must form an independent set of an as- 
sociated matroid. Constant-competitive algorithms (algorithms whose 
expected solution weight is within a constant factor of the optimal) 
are known for many types of matroid secretary problems. We examine
the laminar matroid and give an algorithm with a provable competitive
ratio of 0.053.

1 Introduction 

In the classical secretary problem, one interviews n secretaries sequentially 
in random order, each order having equal probability. As soon as one in- 
terviews a secretary, one learns the skill level of that secretary, relative to 
all previously seen applicants. At this point the interviewer must make an 
irrevocable decision whether or not to hire. The goal is to hire the best 
secretary. 

For this problem, [1], [2], [3] discuss the elegant optimal algorithm. This
algorithm looks at the first $n/e$ secretaries, rejects them all, and then from
among the remaining secretaries chooses the first one who is better than
each of the first $n/e$ observed secretaries (if any). This simple algorithm hires
the best secretary with probability $1/e$.
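
A minimal sketch of this cutoff rule is given below. The function name `classic_secretary` and the cutoff $\lfloor n/e \rfloor$ are illustrative assumptions (the latter being the standard form of the rule); this is not code taken from [1], [2], [3].

```python
import math


def classic_secretary(skills):
    """Return the index hired by the cutoff rule, or None if nobody is hired.

    `skills` lists distinct skill values in arrival order. The cutoff
    floor(n / e) is an assumption matching the standard form of the rule.
    """
    n = len(skills)
    cutoff = int(n / math.e)  # size of the observation phase
    # Best skill seen during the observation phase (nobody is hired here).
    best_seen = max(skills[:cutoff], default=float("-inf"))
    for idx in range(cutoff, n):
        if skills[idx] > best_seen:
            return idx  # first candidate who beats everyone observed so far
    return None  # the best candidate was in the observation phase
```

Only comparisons between candidates are used, so the rule needs relative ranks rather than absolute skill values.
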

One of the many generalizations of the secretary problem is called the
matroid secretary problem. Here, we are given a matroid $\mathcal{M}(U, \mathcal{I})$ (which
is known completely beforehand). The ground set also contains weights for
each element, which are unknown a priori. The elements arrive one by one in
a random order. We denote this ordering by $\pi$, a permutation on $n$ elements.
Each element reveals its weight when it arrives. As before we must make an 
irrevocable decision whether to accept or reject the element when it arrives. 
The goal is to choose an independent set of the largest weight. 

A matroid is a particularly attractive setting for the secretary problem, 
because of the exchange property. This ensures that even if we make a 
bad decision about which element to accept, we are not locked in to a bad 
solution set. In matroid secretary problems, as opposed to more general 
secretary problems, we can often find a solution set which is relatively close 
to the optimal one. 

The matroid secretary problem can be viewed as a simple model for irre- 
vocable decisions in the presence of uncertainty as to future opportunities. 
The use of a random permutation is conceptually simple, but allows powerful
bounds with a minimum of auxiliary information. Other models, which may
include prior distributions on the price structure, are possible.

For secretary problems, we define the competitive ratio to be the ratio 
of the expected weight obtained by our algorithm, divided by the optimal 
weight. We note that in the classical secretary problem, one has a 1/e chance 
of choosing the best applicant; for the matroid secretary problem, we do not 
care about the probability of selecting the largest-weight independent set 
from the matroid, only about selecting sets which have large weight on average.
Furthermore, we do not need any probability of obtaining a large-weight set 
(other than is implied by Markov's inequality). 

For general matroids, [5] gives an $O(\sqrt{\log r})$-competitive algorithm where
$r$ is the rank of the matroid. For many special classes of matroids,
constant-competitive algorithms are known. In particular, [6] provides a
$\frac{16000}{3}$-competitive algorithm for laminar matroids. An alternative algorithm has
been demonstrated in [4], which gives a 0.070-competitive algorithm for the
laminar matroid.

We improve the algorithm of [6] and obtain a tighter analysis, showing
a 0.053-competitive algorithm for the laminar matroid. This improves on
[6] by nearly 300-fold, and nearly brings the algorithm of [6] to parity with
the new algorithm of [4].

2 Definitions and Notation



We let $U$ be the ground set and $w : U \to \mathbb{R}$ be the weight function. Then a
laminar matroid is defined by a laminar family $\mathcal{F}$ of subsets of $U$. That is, for any
$A, B \in \mathcal{F}$ we have $A \subseteq B$ or $B \subseteq A$ or $A \cap B = \emptyset$. In other words, any two sets in
$\mathcal{F}$ are either nested or disjoint. Each set $A \in \mathcal{F}$ has an associated capacity
$\mu(A)$. A set $X \subseteq U$ is an independent set in the matroid iff $|X \cap A| \le \mu(A)$
for all $A \in \mathcal{F}$.
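
The independence test can be sketched directly from this definition. The representation of the family as a list of `(set, capacity)` pairs, the name `is_independent`, and the small example family are all assumptions made for illustration.

```python
def is_independent(X, family):
    """Check |X ∩ A| <= mu(A) for every (A, mu(A)) pair in `family`."""
    X = set(X)
    return all(len(X & A) <= cap for A, cap in family)


# Hypothetical laminar family on U = {1, ..., 5}: any two sets are
# nested or disjoint, and each set carries its capacity mu(A).
family = [
    (frozenset({1, 2}), 1),
    (frozenset({3, 4, 5}), 2),
    (frozenset({1, 2, 3, 4, 5}), 2),
]
```

With this family, `{1, 3}` is independent, while `{1, 2}` violates the capacity of the first set and `{1, 3, 4}` violates the capacity of the ground set.
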

Without loss of generality we may assume $\mu(A) < \mu(B)$ for any $A, B \in \mathcal{F}$
with $A \subsetneq B$; for otherwise $A$ is redundant and may be removed from $\mathcal{F}$.

We use the terminology of [6]. For $i \in U$, we let $M(i)$ denote the
minimal set $B \in \mathcal{F}$ such that $i \in B$. We say that $B_1 \in \mathcal{F}$ is a child of
$B_2 \in \mathcal{F}$ if $B_1 \subsetneq B_2$ and there exists no intermediate set $B' \in \mathcal{F}$ such that
$B_1 \subsetneq B' \subsetneq B_2$. Naturally $B_2$ is called the parent of $B_1$.

For any $A, B \in \mathcal{F}$ such that $A \subseteq B$, we define $\mathrm{Chain}[A, B]$ to be the
sequence of sets in $\mathcal{F}$ starting with $A$ and ending with $B$ where each set is a
child of the following set. In order to denote all sets in $\mathcal{F}$ that contain $i$, we
may interchangeably use $\mathrm{Chain}[M(i), U]$ or $\mathcal{F}(i)$. To save notation, let OPT
denote the optimal solution itself or the total weight of the optimal solution
depending on the context. For any $V \subseteq U$ and $B \in \mathcal{F}$, let $\mathrm{OPT}_V(B)$ denote
the optimal feasible solution that can be obtained from $V \cap B$. For simplicity
of notation, let $\mathrm{OPT}(B) = \mathrm{OPT}_U(B)$. Let $\pi$ denote the random ordering of
elements in $U$.
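
Since the constraints define a matroid, $\mathrm{OPT}_V(B)$ can be computed by the standard matroid greedy algorithm, assuming non-negative weights. The sketch below is one way to do this; the function name `opt`, the dictionary of weights, and the `(set, capacity)` representation of the family are assumptions for illustration only.

```python
def opt(V, B, w, family):
    """Greedy maximum-weight independent subset of V ∩ B.

    `w` maps elements to non-negative weights and `family` is a list of
    (set, capacity) pairs describing the laminar matroid.
    """
    chosen = set()
    # Scan candidates from heaviest to lightest, keeping each one that
    # leaves every capacity constraint satisfied.
    for x in sorted(set(V) & set(B), key=lambda e: w[e], reverse=True):
        trial = chosen | {x}
        if all(len(trial & A) <= cap for A, cap in family):
            chosen = trial
    return chosen
```

Calling `opt(U, U, w, family)` gives OPT, and `opt(S, B, w, family)` gives the optimum restricted to a subset $S$ of the ground set.
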

3 Algorithm



Algorithm 1: KickNext Algorithm
Draw $t \sim \mathrm{Binom}(n, 1 - p)$ and let $S = \{\pi(1), \dots, \pi(t)\}$.
foreach $B \in \mathcal{F}$ do
    let $R(B) \leftarrow \mathrm{OPT}_S(B)$
end
foreach $i \in T = U - S$ (taken in the random order $\pi$) do
    foreach $B \in \mathrm{Chain}[M(i), U]$ do
        if $R(B) \neq \emptyset$ and $w(i)$ is greater than the weight of some element of $R(B)$ then
            Add $i$ to $\mathrm{SOL}(B)$
            Remove the largest element of weight less than $w(i)$ from $R(B)$
        else
            break the loop (go to the next item $i$)
        end
    end
end
Return $\mathrm{SOL}(U)$;

Here, we take the first $1 - p$ proportion of items for the sampling phase
(used to estimate statistical information about the optimal solution), and
we use the remaining fraction $p$ to actually build the solution. As we
will see, the optimal choice of $p$ is about $p \approx 0.08$. From the sampled set of
elements $S$, we calculate $\mathrm{OPT}_S(B)$ as the reference set $R(B)$.

We denote the sample by $S = \{\pi(1), \dots, \pi(t)\}$. These elements are used for sampling
and building statistical information about the optimal set. The remaining
items $T = U - S$ are considered for actual selection.

Note that this algorithm does not use the "Addlt" method used in [6], in
which during the second phase items enter the optimal solution with some 
probability less than one. The intuitive explanation for this difference is 
that any element which is not eligible for the optimal solution should be 
used to build statistical information, and not simply discarded. 

We will briefly explain the intuition behind this algorithm. In the initial 
sampling phase, we build up a set which looks like the globally optimal 
solution; in the second phase, we try to mimic the sample optimum as closely 
as possible. The rule for evicting elements from $R(B)$ appears strange, in
that it would be more natural to remove the lowest-weight element from
$R(B)$ when inserting a new element. However, if we did this, then for
low-weight elements $R(B)$ would become distorted compared to $\mathrm{OPT}(B)$. The
key innovation of [6] was in using this counter-intuitive eviction rule.
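
The following is a minimal sketch of the KickNext procedure as described in this section, not a reference implementation. The names `greedy_opt`, `kick_next` and `draw_sample_size`, and the `(set, capacity)` representation of the family (which is assumed to contain the ground set), are illustrative assumptions. The sketch does not add the dummy elements used in the analysis, so it can only be more conservative than the analyzed algorithm.

```python
import random


def greedy_opt(elems, w, family):
    """Matroid greedy: max-weight independent subset of `elems`."""
    chosen = set()
    for x in sorted(elems, key=lambda e: w[e], reverse=True):
        trial = chosen | {x}
        if all(len(trial & A) <= cap for A, cap in family):
            chosen = trial
    return chosen


def draw_sample_size(n, p, rng):
    """Draw t ~ Binom(n, 1 - p) as a sum of n independent coin flips."""
    return sum(rng.random() < 1 - p for _ in range(n))


def kick_next(order, t, w, family):
    """Run KickNext on arrival `order`, with the first `t` items as sample.

    `family` is a list of (frozenset, capacity) pairs that includes the
    ground set. Returns SOL(U).
    """
    sample = order[:t]
    # Reference sets R(B) = OPT_S(B), one per set B of the family.
    R = [greedy_opt([x for x in sample if x in A], w, family)
         for A, _ in family]
    SOL = [set() for _ in family]
    for i in order[t:]:
        # Chain[M(i), U]: the sets containing i are nested, so sorting
        # them by size lists them from M(i) up to U.
        chain = sorted((k for k, (A, _) in enumerate(family) if i in A),
                       key=lambda k: len(family[k][0]))
        for k in chain:
            lighter = [x for x in R[k] if w[x] < w[i]]
            if not lighter:
                break  # i is rejected at this level; go to the next item
            SOL[k].add(i)
            # Evict the heaviest reference element lighter than i.
            R[k].remove(max(lighter, key=lambda x: w[x]))
    top = max(range(len(family)), key=lambda k: len(family[k][0]))
    return SOL[top]
```

A full run would draw `t = draw_sample_size(n, p, rng)` for a uniformly shuffled `order`; `kick_next` takes `t` as an argument so that its behavior on a fixed order can be checked by hand.
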

4 Analysis 

Note that the algorithm selects an element $i$ by kicking out a smaller element
in $R(B)$ for all $B \in \mathcal{F}(i)$. An element $i$ is not selected to be in $\mathrm{SOL}(B)$
iff all elements with weight smaller than $w(i)$ in $R(B)$ have been kicked out
already.

We assume that, at the end of the sampling phase, we have $|\mathrm{OPT}_S(B)| = \mu(B)$
exactly for all $B \in \mathcal{F}$. We can force this to occur with probability one
by adding infinitely many elements of infinitesimal weight to the matroid, 
which will not affect the algorithm's behavior. This simplifying assumption 
allows us to avoid some corner cases. 

Finally, we assume that items have distinct weights; this can be achieved 
by adding infinitesimal perturbations to the original weights. This affects 
the behavior of the optimal algorithm only infinitesimally. The perturbation 
may affect the behavior of this algorithm substantially, as it is based on 
determining hard cut-off values for whether to accept an element. However, 
it will suffice to show a good competitive ratio on the perturbed weights. 

For a given set $B \in \mathcal{F}$, most elements $x \in T$ will be immediately
disqualified from affecting $B$ in any way. A simple necessary condition for an
element $x \in B$ to affect the set $\mathrm{SOL}(B)$ is that the weight of $x$ exceeds the
smallest weight of an element of $\mathrm{OPT}_S(B')$, for all $B'$ in the chain between $M(x)$
and $B$. We call such elements qualifying for $B$. We can bound the number
of such qualifying elements as follows:

Lemma 4.1. Consider any set $B \in \mathcal{F}$ and element $i \in U$. Let $\mathrm{OPT}_S(B) =
\{a_1, a_2, \dots, a_m\}$ sorted so that $w(a_1) < w(a_2) < \dots < w(a_m)$. For
notational convenience, set $w(a_{m+1}) = \infty$. Let $N_j \subseteq T$, for $j = 1, \dots, m$,
denote the set of elements $x$ which satisfy the following conditions:

1. $x$ qualifies for $B$

2. $w(a_j) < w(x) < w(a_{j+1})$

3. $x \in T$

Then for any non-negative integers $n_1, \dots, n_m$, we have

$$P(|N_1| = n_1 \wedge \dots \wedge |N_m| = n_m \mid i \notin S) \le p^{n_1 + \dots + n_m}$$

Proof. It suffices to show that, for any $j = 1, \dots, m$, the probability that
$|N_j| = n_j$, conditional on $i \notin S$ as well as $|N_{j+1}| = n_{j+1}, \dots, |N_m| = n_m$, is
at most $p^{n_j}$.

Note that $N_j$ is determined solely by the elements of weight less than
$w(a_{j+1})$. Suppose we condition on some choice of $a_{j+1}, \dots, a_m$. Now $N_j$
depends solely on the positions of elements with weights less than $w(a_{j+1})$,
and in particular is independent of $N_{j+1}, \dots, N_m$. Then $a_j$ is the element
of $U$ satisfying the five conditions:

1. $a_j \neq i$

2. $w(a_j) < w(a_{j+1})$

3. $\{a_j, a_{j+1}, \dots, a_m\} \in \mathcal{I}$

4. $a_j \in S$

5. $a_j$ has maximal weight among all elements that satisfy (1)-(4).

(Condition (1) is redundant, as $i \notin S$ and $a_1, \dots, a_m \in S$.) We now
claim that any qualifying element $x \neq i$ such that $w(x) < w(a_{j+1})$ must
satisfy $\{x, a_{j+1}, \dots, a_m\} \in \mathcal{I}$. For, suppose $x$ violates some constraint $\mu(B') = k$, for
$B' \subseteq B$. Then this implies that among $\{a_{j+1}, \dots, a_m\}$ there are exactly $k$
elements in $B'$. In particular, $x$ does not qualify for $B' \subseteq B$.

Now consider the set $X \subseteq U$ consisting of all elements $x$ which satisfy

$$x \neq i, \quad w(x) < w(a_{j+1}), \quad \{x, a_{j+1}, \dots, a_m\} \in \mathcal{I}.$$

As we have seen, $a_j$ is the element of $X \cap S$ of largest weight and $n_j$ is the
number of elements of $X$ of greater weight than $a_j$.

If $|X| < n_j$, then the probability that $|N_j| = n_j$ is zero. Otherwise, we
can view this as the following process. Suppose we sort the elements of $X$
in order of decreasing weight. Starting with the largest element of $X$, we
assign elements to either $S$ or $T$. These assignments to $S$ are independent
with probability $1 - p$. Then $|N_j| = n_j$ iff we assign the first $n_j$ elements to
$T$ (probability $p$ each) and the $(n_j + 1)$th (if it exists) to $S$, which occurs with
probability at most $p^{n_j}$.

Hence, conditional on any $a_{j+1}, \dots, a_m$, the probability that $|N_j| = n_j$
is at most $p^{n_j}$.

□ 

4.1 Probability of selecting an item

Define the backward rank of element $i$ for $B \in \mathcal{F}$, denoted as $\mathrm{brank}(i, B)$, to
be the number of elements in $\mathrm{OPT}(B)$ having weight less than $w(i)$. Similarly,
let $\mathrm{brank}_S(i, B)$ be the number of elements in $\mathrm{OPT}_S(B)$ having weight less
than $w(i)$. It can be easily seen that $\mathrm{brank}_S(i, B) \ge \mathrm{brank}(i, B)$. Furthermore,
if $i \in T$, then $\mathrm{brank}_S(i, B) \ge \mathrm{brank}(i, B) + 1$ (proved in [6]). Intuitively, an
element $i$ is more likely to be picked by the algorithm if its $\mathrm{brank}(i, B)$ is
large.
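
As a small illustration of the two backward ranks, the sketch below counts lighter elements in a given solution set. The function name `brank`, the element labels and the weights are hypothetical; the sets passed in are assumed to be $\mathrm{OPT}(B)$ and $\mathrm{OPT}_S(B)$ for a single set $B$ of capacity 3.

```python
def brank(i, solution, w):
    """Number of elements of `solution` with weight less than w(i)."""
    return sum(1 for x in solution if w[x] < w[i])


# Hypothetical weights for one set B with capacity 3.
w = {"e1": 10, "e2": 8, "e3": 7, "e4": 4, "e5": 2}
opt_B = {"e1", "e2", "e3"}      # OPT(B): the three heaviest elements
opt_S_B = {"e1", "e3", "e4"}    # OPT_S(B) when e2 falls outside the sample
```

Here `e2` lies in $T$, and its backward rank rises from 1 in `opt_B` to 2 in `opt_S_B`, in line with the inequality quoted from [6].
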

Now, when element $i \in T$ is considered for inclusion in the solution set, it
will be rejected iff there is some $B \in \mathcal{F}(i)$ such that all elements in $\mathrm{OPT}_S(B)$
of weight less than $w(i)$ have been evicted already. Let $\mathrm{AllKicked}(i, B)$
denote this bad event. We can bound the probability of this event as follows.

Lemma 4.2. Suppose $p < 1/2$. Consider any $B \in \mathcal{F}$ and $i \in \mathrm{OPT}$. Now if
we define

$$a = \frac{p + (1 - p) \log(1 - p)}{2 (1 - p) p^2}, \qquad c = 4p(1 - p)$$

Then we have

$$P(\mathrm{AllKicked}(i, B)) \le \frac{a \, c^{\mathrm{brank}(i,B)+1}}{1 - c}$$

Proof. Fix some $i \in \mathrm{OPT}$ and let $\mathrm{brank}(i, B) = d$. All the probabilities we
calculate in this proof are conditioned on $i \notin S$; we no longer specify this
explicitly to simplify the notation.

Let $\mathrm{OPT}_S(B) = \{a_1, \dots, a_m\}$ sorted so that $w(a_1) < w(a_2) < \dots <
w(a_m)$. Because of the KickNext rule, the item $i$ will go into $\mathrm{SOL}(B)$ unless,
for some $l \in \{d + 1, \dots, m\}$, there have been at least $l$ items of weight less
than $w(a_{l+1})$ added to $\mathrm{SOL}(B)$ before it.

Now consider an element $i' \neq i$. In order for such an $i'$ to have been
added to $\mathrm{SOL}(B)$ before $i$, the following events must have occurred:

1. $i'$ is qualifying for $B$

2. $i'$ comes before $i$ in the ordering $\pi$

We view the suffix of the permutation $\pi$ corresponding to $T$ as generated
by the following process. Each element $x \in T$ chooses $\rho(x)$ uniformly at
random from the real interval $[0, 1]$. We then form the suffix of $\pi$ by sorting
by $\rho$. Suppose we condition on a fixed value of $r = \rho(i)$. Now consider an
element $i' \neq i$. In order for such an $i'$ to have been added to $\mathrm{SOL}(B)$ before
$i$, the following events must have occurred:

1. $i'$ is qualifying for $B$

2. $\rho(i') < r$.

Let $Q_l$ denote the number of qualifying items other than $i$ with weight
$< w(a_{l+1})$ and let $A_l$ denote the number of such items which also have
$\rho(i') < r$. We wish to estimate the probability that $A_l \ge l$.



By Lemma 4.1 the random variable $Q_l$ is stochastically dominated by
the sum of $l$ independent geometric-$p$ random variables. Given a fixed value
for $Q_l$, each such qualifying item $i'$ has a probability $r$ of occurring before
$i$. Furthermore, these events are independent (conditional on $r$). Hence
the probability $P(A_l \ge l \mid Q_l = k, i \notin S)$ is at most the probability that a
binomial random variable, of $k$ trials and probability $r$, is at least $l$. In effect,
the random variable $A_l$ is formed by thinning a negative binomial random
variable $Q_l$ with a binomial-$r$ distribution. Binomial thinning of a negative
binomial variable yields another negative binomial variable, hence the distribution of $A_l$ is
stochastically dominated by the negative binomial distribution of probability

$$q = \frac{rp}{1 - p + rp}.$$

We now wish to estimate the probability that $A_l \ge l$. For a negative
binomial random variable $A'_l$, the event $A'_l \ge l$ is equivalent to the situation
that we flip a biased coin $2l - 1$ times, where the probability of success
is $q$, and the total number of successes is at least $l$; this is a binomial
tail probability. Hence we have

$$P(A_l \ge l \mid \rho(i) = r) \le P(\mathrm{Binomial}(2l - 1, q) \ge l)$$

Note that as $p < 1/2$, we have $q < 1/2$ as well. By the Chernoff bound
the probability of such a deviation is at most $\exp\bigl(-(2l - 1)\,\mathrm{RelEnt}(\tfrac{l}{2l-1} \,\|\, q)\bigr)$. Here
RelEnt is the relative entropy function, given by

$$\mathrm{RelEnt}(x \,\|\, y) = x \log(x/y) + (1 - x) \log \frac{1 - x}{1 - y}.$$



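We can simplify this as

$$
\begin{aligned}
P(A_l \ge l \mid \rho(i) = r) &\le \exp\Bigl(-(2l - 1)\,\mathrm{RelEnt}\bigl(\tfrac{l}{2l-1} \,\big\|\, q\bigr)\Bigr) \\
&= \Bigl(\frac{l - 1}{(2l - 1)(1 - q)}\Bigr)^{-(l-1)} \Bigl(\frac{l}{(2l - 1) q}\Bigr)^{-l} \\
&\le \frac{(4q(1 - q))^l}{2 - 2q}.
\end{aligned}
$$

Integrating over $r \in [0, 1]$, and using $q \le p < 1/2$ (so that $4q(1 - q) \le 4p(1 - p)$) for all but one factor, gives

$$P(A_l \ge l) \le (4p(1 - p))^l \int_0^1 \frac{4q(1 - q)}{(2 - 2q) \, 4p(1 - p)} \, dr = a c^l$$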

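We use the union bound for the event $\mathrm{AllKicked}(i, B)$:

$$
\begin{aligned}
P(\mathrm{AllKicked}(i, B)) &\le \sum_{l=d+1}^{\infty} P(A_l \ge l \mid i \notin S) \\
&\le \sum_{l=d+1}^{\infty} a c^l \\
&\le \frac{a c^{d+1}}{1 - c}
\end{aligned}
$$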

□ 
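
As a numerical sanity check on the constant $a$, the sketch below compares the closed form from Lemma 4.2 with a midpoint-rule evaluation of the integral over $r$. It assumes the integrand is $\frac{4q(1-q)}{(2-2q)\,4p(1-p)}$ with $q = \frac{rp}{1-p+rp}$; the function names and the step count are arbitrary choices.

```python
import math


def a_closed_form(p):
    """a = (p + (1 - p) log(1 - p)) / (2 (1 - p) p^2)."""
    return (p + (1 - p) * math.log(1 - p)) / (2 * (1 - p) * p * p)


def a_numeric(p, steps=20000):
    """Midpoint rule for the integral over r in [0, 1] of
    4q(1-q) / ((2-2q) * 4p(1-p)), where q = rp / (1 - p + rp)."""
    total = 0.0
    for s in range(steps):
        r = (s + 0.5) / steps
        q = r * p / (1 - p + r * p)
        total += 4 * q * (1 - q) / ((2 - 2 * q) * 4 * p * (1 - p))
    return total / steps
```

For $p = 0.08$ the two functions agree to many decimal places, both giving a value close to 0.279.
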



4.2 Expected weight of SOL 

We cannot take any arbitrary element of the optimal solution and show that 
it is selected with a good probability by our matroid secretary algorithm. 
Instead, we use a similar strategy to the uniform matroid, and examine
the set of high-scoring elements collectively. We show that most of these
elements (but not any particular one of them) are selected with high probability.

We contrast our approach with that of [6], which adopted a hybrid proof
strategy between fully analyzing the collective behavior of the optimal
solution, and analyzing individual elements of the solution. In [6], certain
elements in the optimal solution were identified, referred to as "good"
elements, which were shown to have a high probability of being selected by the
secretary algorithm. This type of analysis is inherently not tight. We will 
instead determine the worst possible arrangement of the optimal solution, 
and show that it still is selected with high probability. 

We use our upper bound on the probability of the event AllKicked to 
obtain a lower bound on the expected weight of our solution: 

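$$
\begin{aligned}
E[w(\mathrm{SOL})] &\ge \sum_{i \in \mathrm{OPT}} w(i) \, P(i \in \mathrm{SOL}) \\
&\ge \sum_{i \in \mathrm{OPT}} w(i) \cdot p \cdot P(i \in \mathrm{SOL} \mid i \notin S) \\
&\ge \sum_{i \in \mathrm{OPT}} w(i) \cdot p \cdot \Bigl[1 - \sum_{B \in \mathcal{F}(i)} P(\mathrm{AllKicked}(i, B))\Bigr] \\
&\ge p \Bigl[\sum_{i \in \mathrm{OPT}} w(i) - \sum_{i \in \mathrm{OPT}} \sum_{B \in \mathcal{F}(i)} w(i) \frac{a \, c^{1+\mathrm{brank}(i,B)}}{1 - c}\Bigr] \\
&= p \Bigl[w(\mathrm{OPT}) - \frac{a}{1 - c} \sum_{i \in \mathrm{OPT}} w(i) \sum_{B \in \mathcal{F}(i)} c^{1+\mathrm{brank}(i,B)}\Bigr]
\end{aligned}
$$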



In order to use this estimate, we need to obtain an upper bound on the sum

$$\sum_{i \in \mathrm{OPT}} w(i) \sum_{B \in \mathcal{F}(i)} c^{1+\mathrm{brank}(i,B)}$$

The presence of the weight w(i) complicates things, so as a preliminary 
we consider the unweighted version of this sum. 

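Let $\mathrm{OPT}^{\mathrm{large}}_m(B)$ denote the $m$ largest elements in $\mathrm{OPT}(B)$.

Lemma 4.3. Let $B \in \mathcal{F}$ and let $m \ge 0$ be an integer. Define $g(m, B)$ by

$$g(m, B) = \sum_{i \in \mathrm{OPT}^{\mathrm{large}}_m(B)} \; \sum_{B' \in \mathrm{Chain}[M(i), B]} c^{1+\mathrm{brank}(i,B')}$$

Suppose $c < 1/2$. Then

$$g(m, B) \le \frac{2c}{1 - c} \, |\mathrm{OPT}^{\mathrm{large}}_m(B)|.$$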

Proof. For each integer $i$ define $c_i = c + c^2 + \dots + c^i$, and define $c_\infty = \frac{c}{1 - c}$.

We will need to show a stronger bound, specifically that for all $B \in \mathcal{F}$
and all $m \ge 0$ we have

$$g(m, B) \le 2c_1 + \dots + 2c_{m-1} + c_m + c_m c_{k-m}$$

where $k = \mu(B) \ge m$.

We will show this by induction on the capacity k. Note that for a given 
value of k, we are proving the inductive hypothesis simultaneously for all 
B G T and all possible values of m. 

We view the laminar family as consisting of levels, corresponding to 
each possible value for the capacity. When computing g(m,B), we have 
the contribution at level $k$ itself, as well as the contribution from the lower
levels. Let $B_1, \dots, B_j$ be a coarsest $\mathcal{F}$-partition of $B$ (other than $B$ itself).
Let $X = \mathrm{OPT}^{\mathrm{large}}_m(B)$, and let $m_i = |B_i \cap X|$ and $k_i = \mu(B_i)$ for each
$i = 1, \dots, j$. For each $i$ we have $X \cap B_i = \mathrm{OPT}^{\mathrm{large}}_{m_i}(B_i)$. By the capacity
constraints we must have $m_i \le k_i < k$ for each $i$.
By laminarity we have 

$$g(m, B) = \sum_{i \in X} c^{1+\mathrm{brank}(i,B)} + g(m_1, B_1) + \dots + g(m_j, B_j)$$

The elements of $X$ have the maximal backward ranks in $B$. Hence the term
$\sum_{i \in X \cap B} c^{1+\mathrm{brank}(i,B)} = c^k + \dots + c^{k-m+1} = c_m c^{k-m}$. Each $B_i$ has rank less
than $k$ so we apply the inductive hypothesis and obtain

$$g(m, B) \le c_m c^{k-m} + \sum_{i=1}^{j} \bigl(2c_1 + \dots + 2c_{m_i - 1} + c_{m_i} + c_{m_i} c_{k_i - m_i}\bigr)$$

The right-hand side is a convex function of $m_1, \dots, m_j$, hence it attains its
maximum when these are set to their most extreme possible values. When
$m < k$ strictly, we may set $j = 1$, $m_1 = m$, $k_1 = k - 1$; when $m = k$, we may
set $j = 2$, $k_1 = k_2 = k - 1$, $m_1 = m - 1$, $m_2 = 1$. In the first case, we obtain

$$
\begin{aligned}
g(m, B) &\le c_m c^{k-m} + \sum_{i} \bigl(2c_1 + \dots + 2c_{m_i - 1} + c_{m_i} + c_{m_i} c_{(k-1) - m_i}\bigr) \\
&= c_m c^{k-m} + 2c_1 + \dots + 2c_{m-1} + c_m + c_m c_{k-1-m} \\
&= 2c_1 + \dots + 2c_{m-1} + c_m + c_m (c_{k-1-m} + c^{k-m}) \\
&= 2c_1 + \dots + 2c_{m-1} + c_m + c_m c_{k-m}
\end{aligned}
$$

In the second case, we obtain 

$$
\begin{aligned}
g(m, B) &\le c_m c^{k-m} + 2c_1 + \dots + 2c_{m-2} + c_{m-1} + c_{m-1} c_{(k-1)-(m-1)} + c_1 + c_1 c_{(k-1)-1} \\
&= c_m + 2c_1 + \dots + 2c_{m-2} + c_{m-1} + c_1 + c_1 c_{m-2} \\
&= 2c_1 + \dots + 2c_{m-2} + c_{m-1} + c_m + c + c \, c_{m-2} \\
&= 2c_1 + \dots + 2c_{m-1} + c_m \\
&= 2c_1 + \dots + 2c_{m-1} + c_m + c_m c_{k-m}
\end{aligned}
$$

as claimed. □ 
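
The stronger bound implies the statement of the lemma because $c_i \le c_\infty \le 1$ whenever $c \le 1/2$. The sketch below checks this numerically on a small grid; the function names are illustrative, and the grid of values for $c$, $m$ and $k$ is an arbitrary choice.

```python
def c_sub(c, i):
    """c_i = c + c^2 + ... + c^i (so c_0 = 0)."""
    return sum(c ** t for t in range(1, i + 1))


def strong_bound(c, m, k):
    """2c_1 + ... + 2c_{m-1} + c_m + c_m * c_{k-m}, for 0 <= m <= k."""
    return (2 * sum(c_sub(c, t) for t in range(1, m))
            + c_sub(c, m) + c_sub(c, m) * c_sub(c, k - m))


def lemma_bound(c, m):
    """2 * c_infinity * m, with c_infinity = c / (1 - c)."""
    return 2 * m * c / (1 - c)


def check_grid(cs=(0.1, 0.2944, 0.5), max_k=8):
    """Return True if the strong bound never exceeds the lemma bound."""
    return all(strong_bound(c, m, k) <= lemma_bound(c, m) + 1e-12
               for c in cs
               for k in range(1, max_k + 1)
               for m in range(0, k + 1))
```

The value $c = 0.2944$ in the grid corresponds to $c = 4p(1-p)$ at $p = 0.08$.
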

Next we use this unweighted bound to bound the weighted sum: 

Lemma 4.4. If $c < 1/2$ we have

$$\sum_{i \in \mathrm{OPT}} w(i) \sum_{B \in \mathcal{F}(i)} c^{1+\mathrm{brank}(i,B)} \le \frac{2c}{1 - c} \, w(\mathrm{OPT})$$

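Proof. Sort the elements of OPT by weight so that $w(x_1) \ge w(x_2) \ge \dots \ge
w(x_l)$. Define $c_\infty = \frac{c}{1 - c}$ as above. Then we have

$$
\begin{aligned}
&\sum_{i \in \mathrm{OPT}} w(i) \sum_{B \in \mathcal{F}(i)} c^{1+\mathrm{brank}(i,B)} \\
&\quad = w(x_l) \, g(l, U) + (w(x_{l-1}) - w(x_l)) \, g(l - 1, U) + \dots + (w(x_1) - w(x_2)) \, g(1, U) \\
&\quad \le w(x_l) \, 2 l c_\infty + (w(x_{l-1}) - w(x_l)) \, 2 (l - 1) c_\infty + \dots + (w(x_1) - w(x_2)) \, 2 c_\infty \\
&\quad = 2 c_\infty \bigl(w(x_l) + w(x_{l-1}) + \dots + w(x_1)\bigr) \\
&\quad = 2 c_\infty \, w(\mathrm{OPT})
\end{aligned}
$$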

□ 

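We consider the contributions to SOL of the elements of OPT.

Theorem 4.1. The expected value of the weight of SOL is at least a factor
$p \bigl(1 - \frac{2ac}{(1 - c)^2}\bigr)$ of optimal.

Proof.

$$
\begin{aligned}
E[w(\mathrm{SOL})] &\ge \sum_{i \in \mathrm{OPT}} w(i) \, p \, \Bigl(1 - \sum_{B \in \mathcal{F}(i)} P(\mathrm{AllKicked}(i, B))\Bigr) \\
&\ge p \Bigl(\sum_{i \in \mathrm{OPT}} w(i) - \frac{a}{1 - c} \sum_{B \in \mathcal{F}} \sum_{i \in \mathrm{OPT} \cap B} w(i) \, c^{1+\mathrm{brank}(i,B)}\Bigr) \\
&\ge p \Bigl(w(\mathrm{OPT}) - \frac{a}{1 - c} \, 2 c_\infty \, w(\mathrm{OPT})\Bigr) \\
&= w(\mathrm{OPT}) \, p \Bigl(1 - \frac{2ac}{(1 - c)^2}\Bigr)
\end{aligned}
$$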



□ 

Theorem 4.2. The KickNext algorithm achieves a competitive ratio of 0.053.
Proof. Set $p = 0.08$ and apply Theorem 4.1. □
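
The arithmetic behind this choice of $p$ can be checked with a short script. The sketch below assumes the guarantee of Theorem 4.1 has the form $p\bigl(1 - \frac{2ac}{(1-c)^2}\bigr)$ with $a$ and $c$ as in Lemma 4.2; the function name `ratio` and the search grid are illustrative.

```python
import math


def ratio(p):
    """Lower bound p * (1 - 2ac / (1 - c)^2) on the competitive ratio."""
    a = (p + (1 - p) * math.log(1 - p)) / (2 * (1 - p) * p ** 2)
    c = 4 * p * (1 - p)
    return p * (1 - 2 * a * c / (1 - c) ** 2)


# Coarse grid search over p in (0, 0.2) for the best guarantee.
best_p = max((k / 1000 for k in range(1, 200)), key=ratio)
```

Evaluating `ratio(0.08)` gives a value slightly above 0.053, and the grid search places the maximizer close to $p = 0.08$.
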